Data Visualization

HiClass: a Python Library for Local Hierarchical Classification Compatible with Scikit-learn

Updated: 2023-03-31 16:24:21

Fábio M. Miranda, Niklas Köhnecke, Bernhard Y. Renard; 24(29):1-17, 2023. Abstract: HiClass is an open-source Python library for local hierarchical classification entirely compatible with scikit-learn. It contains implementations of the most common design patterns for hierarchical machine learning models found in the literature, that is, the local classifiers per node, per parent node and per level. Additionally, the package contains implementations of hierarchical metrics, which are more appropriate for …

The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time

Updated: 2023-03-31 16:24:21

Raj Agrawal, Tamara Broderick; 24(27):1-60, 2023. Abstract: Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection. Unfortunately, methods that simultaneously express sparsity, nonlinearity, and interactions are computationally intractable with runtime at least …

Generalization Bounds for Noisy Iterative Algorithms Using Properties of Additive Noise Channels

Updated: 2023-03-31 16:24:21

Hao Wang, Rui Gao, Flavio P. Calmon; 24(26):1-43, 2023. Abstract: Machine learning models trained by different optimization algorithms under different data distributions can exhibit distinct generalization behaviors. In this paper, we analyze the generalization of models trained by noisy iterative algorithms. We derive distribution-dependent generalization bounds by connecting noisy iterative algorithms to additive noise channels found in communication and information theory. Our generalization bounds shed …

Discrete Variational Calculus for Accelerated Optimization

Updated: 2023-03-31 16:24:21

Cédric M. Campos, Alejandro Mahillo, David Martín de Diego; 24(25):1-33, 2023. Abstract: Many of the new developments in machine learning are connected with gradient-based optimization methods. Recently, these methods have been studied using a variational perspective (Betancourt et al., 2018). This has opened up the possibility of introducing variational and symplectic methods using geometric integration. In particular, in this paper, we introduce variational integrators (Marsden and West, 2001) which allow us to derive different methods for …

Calibrated Multiple-Output Quantile Regression with Representation Learning

Updated: 2023-03-31 16:24:21

Shai Feldman, Stephen Bates, Yaniv Romano; 24(24):1-48, 2023. Abstract: We develop a method to generate predictive regions that cover a multivariate response variable with a user-specified probability. Our work is composed of two components. First, we use a deep generative model to learn a representation of the response that has a unimodal distribution. Existing multiple-output quantile regression approaches are effective in such cases, so we apply them on the learned representation, and then transform the solution to the original …

Bayesian Data Selection

Updated: 2023-03-31 16:24:21

Eli N. Weinstein, Jeffrey W. Miller; 24(23):1-72, 2023. Abstract: Insights into complex, high-dimensional data can be obtained by discovering features of the data that match or do not match a model of interest. To formalize this task, we introduce the data selection problem: finding a lower-dimensional statistic (such as a subset of variables) that is well fit by a given parametric model of interest. A fully Bayesian approach to data selection would be to parametrically model the value of the statistic, nonparametrically model the remaining background components of the data, and …

Graph-Aided Online Multi-Kernel Learning

Updated: 2023-03-31 16:24:21

Pouya M. Ghari, Yanning Shen; 24(21):1-44, 2023. Abstract: Multi-kernel learning (MKL) has been widely used in learning problems involving function learning tasks. Compared with the single-kernel learning approach, which relies on a pre-selected kernel, the advantage of MKL is the flexibility that results from combining a dictionary of kernels. However, inclusion of irrelevant kernels in the dictionary may deteriorate the accuracy of MKL and increase the computational complexity. Faced with this challenge, a novel graph-aided framework is developed to select a subset of …

Regularized Joint Mixture Models

Updated: 2023-03-31 16:24:21

Konstantinos Perrakis, Thomas Lartigue, Frank Dondelinger, Sach Mukherjee; 24(19):1-47, 2023. Abstract: Regularized regression models are well studied and, under appropriate conditions, offer fast and statistically interpretable results. However, large data in many applications are heterogeneous in the sense of harboring distributional differences between latent groups. Then, the assumption that the conditional distribution of response $Y$ given features $X$ is the same for all samples may not hold. Furthermore, in scientific applications, the covariance structure of the …

Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application to Credit-Risk Evaluation

Updated: 2023-03-31 16:24:21

Cynthia Rudin, Yaron Shaposhnik; 24(16):1-44, 2023. Abstract: We develop a method for understanding specific predictions made by global predictive models by constructing local models tailored to each specific observation (these are also called explanations in the literature). Unlike existing work that “explains” specific observations by approximating global models in the vicinity of these observations, we fit models that are globally-consistent with predictions made by the global model on past …

Python package for causal discovery based on LiNGAM

Updated: 2023-03-31 16:24:21

Causal discovery is a methodology for learning causal graphs from data, and LiNGAM is a well-known model for causal discovery. This paper describes an open-source Python package for causal discovery based on LiNGAM. The package implements various LiNGAM methods under different settings like time series cases, multiple-group cases, mixed data cases, and hidden common cause cases, in addition to evaluation of statistical reliability and model assumptions. The source code is freely available under the MIT license at https://github.com/cdt15/lingam.

Sampling random graph homomorphisms and applications to network data analysis

Updated: 2023-03-31 16:24:21

Hanbaek Lyu, Facundo Memoli, David Sivakoff; 24(9):1-79, 2023. Abstract: A graph homomorphism is a map between two graphs that preserves adjacency relations. We consider the problem of sampling a random graph homomorphism from a graph into a large network. We propose two complementary MCMC algorithms for sampling random graph homomorphisms and establish bounds on their mixing times and the concentration of their time averages. Based on our sampling algorithms, we propose a novel framework for network data analysis that circumvents …

AutoKeras: An AutoML Library for Deep Learning

Updated: 2023-03-31 16:24:21

Haifeng Jin, François Chollet, Qingquan Song, Xia Hu; 24(6):1-6, 2023. Abstract: To use deep learning, one needs to be familiar with various software tools like TensorFlow or Keras, as well as various model architecture and optimization best practices. Despite recent progress in software usability, deep learning remains a highly specialized occupation. To enable people with limited machine learning and programming experience to adopt deep learning, we developed AutoKeras, an Automated Machine Learning (AutoML) library that automates the process of model …

Cluster-Specific Predictions with Multi-Task Gaussian Processes

Updated: 2023-03-31 16:24:21

Arthur Leroy, Pierre Latouche, Benjamin Guedj, Servane Gey; 24(5):1-49, 2023. Abstract: A model involving Gaussian processes (GPs) is introduced to simultaneously handle multitask learning, clustering, and prediction for multiple functional data. This procedure acts as a model-based clustering method for functional data as well as a learning step for subsequent predictions for new tasks. The model is instantiated as a mixture of multi-task GPs with common mean processes. A variational EM algorithm is derived for dealing with the optimisation of …

Efficient Structure-preserving Support Tensor Train Machine

Updated: 2023-03-31 16:24:21

Kirandeep Kour, Sergey Dolgov, Martin Stoll, Peter Benner; 24(4):1-22, 2023. Abstract: An increasing amount of the collected data are high-dimensional multi-way arrays (tensors), and it is crucial for efficient learning algorithms to exploit this tensorial structure as much as possible. The ever present curse of dimensionality for high dimensional data and the loss of structure when vectorizing the data motivates the use of tailored low-rank tensor classification methods. In the presence of small amounts of training data, kernel methods offer an …

Bayesian Spiked Laplacian Graphs

Updated: 2023-03-31 16:24:21

Leo L Duan, George Michailidis, Mingzhou Ding; 24(3):1-35, 2023. Abstract: In network analysis, it is common to work with a collection of graphs that exhibit heterogeneity. For example, neuroimaging data from patient cohorts are increasingly available. A critical analytical task is to identify communities, and graph Laplacian-based methods are routinely used. However, these methods are currently limited to a single network and also do not provide measures of uncertainty on the community assignment. In this work, we first propose a probabilistic network model called the …

Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search

Updated: 2023-03-31 16:24:21

Benjamin Moseley, Joshua R. Wang; 24(1):1-36, 2023. Abstract: Hierarchical clustering is a data analysis method that has been used for decades. Despite its widespread use, the method has an underdeveloped analytical foundation. Having a well-understood foundation would both support the currently used methods and help guide future improvements. The goal of this paper is to give an analytic framework to better understand observations seen in practice. This paper considers the dual of a problem …

The d-Separation Criterion in Categorical Probability

Updated: 2023-03-31 16:24:21

Tobias Fritz, Andreas Klingler; 24(46):1-49, 2023. Abstract: The d-separation criterion detects the compatibility of a joint probability distribution with a directed acyclic graph through certain conditional independences. In this work, we study this problem in the context of categorical probability theory by introducing a categorical definition of causal models, a categorical notion of d-separation, and proving an abstract version of the d-separation criterion. This approach has two main benefits. First, categorical d-separation is a very intuitive …

Robust Load Balancing with Machine Learned Advice

Updated: 2023-03-31 16:24:21

Sara Ahmadian, Hossein Esfandiari, Vahab Mirrokni, Binghui Peng; 24(44):1-46, 2023. Abstract: Motivated by the exploding growth of web-based services and the importance of efficiently managing the computational resources of such systems, we introduce and study a theoretical model for load balancing of very large databases such as commercial search engines. Our model is a more realistic version of the well-received bab model with an additional constraint that limits the number of servers that carry each piece of the data. This additional constraint is …

Benchmarking Graph Neural Networks

Updated: 2023-03-31 16:24:21

Vijay Prakash Dwivedi, Chaitanya K. Joshi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson; 24(43):1-48, 2023. Abstract: In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has witnessed an extensive growth of promising techniques that have been applied with success to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress. This led us …

Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data

Updated: 2023-03-31 16:24:21

Shaowu Pan, Steven L. Brunton, J. Nathan Kutz; 24(41):1-60, 2023. Abstract: High-dimensional spatio-temporal dynamics can often be encoded in a low-dimensional subspace. Engineering applications for modeling, characterization, design, and control of such large-scale systems often rely on dimensionality reduction to make solutions computationally tractable in real time. Common existing paradigms for dimensionality reduction include linear methods, such as the singular value decomposition (SVD), and nonlinear …

Label Distribution Changing Learning with Sample Space Expanding

Updated: 2023-03-31 16:24:21

Chao Xu, Hong Tao, Jing Zhang, Dewen Hu, Chenping Hou; 24(36):1-48, 2023. Abstract: With the evolution of data collection methods, label ambiguity has arisen in various applications. How to reduce its uncertainty and leverage its effectiveness is still a challenging task. As two representative types of label ambiguity, Label Distribution Learning (LDL), which annotates each instance with a label distribution, and Emerging New Class (ENC), which focuses on model reuse with new classes, have attracted extensive attention. Nevertheless, in …

Gap Minimization for Knowledge Sharing and Transfer

Updated: 2023-03-31 16:24:21

Boyu Wang, Jorge A. Mendez, Changjian Shui, Fan Zhou, Di Wu, Gezheng Xu, Christian Gagné, Eric Eaton; 24(33):1-57, 2023. Abstract: Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of performance gap, an intuitive and novel measure of the distance between learning tasks. Unlike …

Sparse PCA: a Geometric Approach

Updated: 2023-03-31 16:24:21

Dimitris Bertsimas, Driss Lahlou Kitane; 24(32):1-33, 2023. Abstract: We consider the problem of maximizing the variance explained from a data matrix using orthogonal sparse principal components that have a support of fixed cardinality. While most existing methods focus on building principal components (PCs) iteratively through deflation, we propose GeoSPCA, a novel algorithm to build all PCs at once while satisfying the orthogonality constraints, which brings substantial benefits over deflation. This novel approach is based on the left eigenvalues of the covariance matrix …

Labels, Information, and Computation: Efficient Learning Using Sufficient Labels

Updated: 2023-03-31 16:24:21

Shiyu Duan, Spencer Chang, Jose C. Principe; 24(31):1-35, 2023. Abstract: In supervised learning, obtaining a large set of fully-labeled training data is expensive. We show that we do not always need full label information on every single training example to train a competent classifier. Specifically, inspired by the principle of sufficiency in statistics, we present a statistic (a summary of the fully-labeled training set) that captures almost all the relevant information for classification but at the same time is …

✚ Visualization Tools and Learning Resources – March 2023 Roundup

Updated: 2023-03-30 18:30:51

Welcome to issue 232 of The Process, where we look closer at how the charts get made. I’m Nathan Yau, and every month I collect visualization tools and resources to help make better charts. This is the good stuff for March. To access this issue of The Process, you must be a member. The Process is a weekly newsletter where I evaluate how visualization tools, rules, and guidelines work in practice. I publish every Thursday. Get it in your inbox or access it via the site. You also gain unlimited access to hundreds of hours worth of …

AI for data storytelling

Updated: 2023-03-28 21:39:35

Artificial Intelligence is already being used in data journalism. For a field which is obsessed about trying to automate tedious tasks, AI is custom made. Data storytelling and journalism have always been at the forefront of technology, first to adopt the newest gadgets and techniques. When VR devices launched, data journalists at the WSJ designed …

✚ Narrow Audience

Updated: 2023-03-23 18:30:46

Welcome to issue 231 of The Process, where we look closer at how the charts get made. I’m Nathan Yau, and this week I’m thinking about communicating data to as few people as possible. To access this issue of The Process, you must be a member. The Process is a weekly newsletter where I evaluate how visualization tools, rules, and guidelines work in practice. I publish every Thursday. Get it in your inbox or access it via the site. You also gain unlimited access to hundreds of hours worth of step-by-step visualization courses and tutorials which will help you make sense of data for …

Creating Interactive Flow Maps with JavaScript

Updated: 2023-03-22 16:17:14

Flow maps are a powerful way to represent the movement of objects between different geographic locations, and with JavaScript, creating an interactive flow map is easier than you might think. Flow maps combine the functionality of a map and a flow diagram; this type of visualization shows the direction of movement of people, goods, money, …

Rani Molla: data for journalism

Updated: 2023-03-20 19:02:19

A new episode of the Data Journalism Podcast is out today, featuring an interview with Rani Molla. Rani is a senior correspondent at Vox Media, reporting for Recode on the intersection between work, technology and the future. She uses data to tell stories every day, whether it’s about our return to the office (or lack thereof), the impacts of …

How to Create Box-and-Whisker Plot in JavaScript

Updated: 2023-03-16 15:17:19

Transform your data into insights with a stunning box-and-whisker plot! Learn what it is and how to create one with ease using JavaScript. In this tutorial, I’ll walk you through the steps to make a neat and visually appealing JS-based (HTML5) box chart with the yearly gross salaries of different IT professions in Europe. Unlock …

Florence Nightingale and the history of dataviz

Updated: 2023-03-13 19:13:00

RJ Andrews is the founder of data design studio Info We Trust and author of a new series of books delving into the deep history of data visualisation and storytelling. In this episode of the pod, he talks about three significant parts of the history of data visuals: Florence Nightingale, Emma Willard and Étienne-Jules Marey. While Nightingale …

Italian Gruppo Astrofili Galileo Galilei Uses AnyChart JS Charts for Astronomical Data Visualization

Updated: 2023-03-09 09:05:19

March 9th, 2023, by AnyChart Team. We are glad to continue supporting non-profit initiatives all over the world by allowing them to use our JavaScript charting library free of charge. Recently, Giorgio Mazzacurati of Gruppo Astrofili …

Data Visualization

Exploring ways to display data


Sources

24 - JMLR

3 - Simon Rogers

3 - AnyChart News

2 - FlowingData
